[LoRA] Adds support for bias in LoRA #5733
base: main
Conversation
Could we add an argument to the engine, enable_lora_bias, and avoid initializing the bias tensors if it's false? If the user knows none of their LoRAs will have bias, we can save memory.
@Yard1 Thanks for reviewing the PR. I have added the enable_lora_bias flag (default set to false), which prevents the allocation of LoRA bias tensors when false.
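For reference, a minimal sketch of the idea behind the flag (simplified names, not the actual vLLM layer classes; the buffer shapes, dtype, and class structure here are assumptions): the engine-level setting is threaded down to the LoRA layer wrappers, and the stacked bias buffer is only allocated when bias support is enabled.

```python
import torch
from typing import Optional


class LoRAWithBiasSketch:
    """Simplified sketch of per-slot LoRA buffers; the bias stack is
    allocated only when the engine enables LoRA bias support."""

    def __init__(self, max_loras: int, input_dim: int, output_dim: int,
                 max_rank: int, enable_lora_bias: bool,
                 dtype: torch.dtype = torch.float16) -> None:
        self.lora_a_stacked = torch.zeros(max_loras, max_rank, input_dim,
                                          dtype=dtype)
        self.lora_b_stacked = torch.zeros(max_loras, output_dim, max_rank,
                                          dtype=dtype)
        # Skip the allocation entirely when the user knows no adapter
        # carries a bias -- this is the memory saving being discussed.
        self.bias_stacked: Optional[torch.Tensor] = (
            torch.zeros(max_loras, output_dim, dtype=dtype)
            if enable_lora_bias else None)

    def set_lora(self, index: int, lora_a: torch.Tensor,
                 lora_b: torch.Tensor,
                 bias: Optional[torch.Tensor] = None) -> None:
        rank = lora_a.shape[0]
        self.lora_a_stacked[index, :rank].copy_(lora_a)
        self.lora_b_stacked[index, :, :rank].copy_(lora_b)
        if bias is not None:
            if self.bias_stacked is None:
                raise ValueError(
                    "Adapter has a bias but LoRA bias support is disabled.")
            self.bias_stacked[index].copy_(bias)
```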
Related: #5930
Looks good, can we also add an e2e test?
@Yard1 Thanks for reviewing. I've added an e2e test for the lora_bias support.
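A rough sketch of what such an e2e test could look like (the actual test in the PR may differ; the base model, adapter path, and assertion are placeholders, and enable_lora_bias is assumed to be exposed as an engine argument as discussed above):

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest


def test_lora_bias_e2e():
    # Placeholder base model and adapter path; the adapter is assumed to
    # have been trained with bias terms enabled.
    llm = LLM(
        model="meta-llama/Llama-2-7b-hf",
        enable_lora=True,
        enable_lora_bias=True,
        max_loras=1,
        max_lora_rank=8,
    )
    outputs = llm.generate(
        ["What is the capital of France?"],
        SamplingParams(temperature=0.0, max_tokens=32),
        lora_request=LoRARequest("bias-adapter", 1, "/path/to/lora-with-bias"),
    )
    assert outputs and outputs[0].outputs[0].text
```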
@followumesh you need to run …
@followumesh apologies, this needs another conflict resolution!
Thanks for all of the work on this @followumesh!
```python
    ):
        self.reset_lora(index)

        if self.tp_size > 1:
            lora_a = self.slice_lora_a(lora_a)
            lora_b = self.slice_lora_b(lora_b)
            if bias is not None:
                bias = self.slice_bias(bias)
```
The typing looks wrong here ... bias is a tensor but the method expects a list.
Here the slice_bias mimics slice_lora_b.
Hmm OK fair enough, I guess the typing errors are preexisting.
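For context, a simplified sketch of a slice_bias that mirrors slice_lora_b for a column-parallel layer; the tp_rank and output_dim attribute names are assumptions about the surrounding class, not the exact vLLM code:

```python
import torch


def slice_bias(self, bias: torch.Tensor) -> torch.Tensor:
    """Keep only this tensor-parallel rank's shard of the LoRA bias,
    sliced along the output dimension just like lora_b."""
    shard_size = self.output_dim              # per-rank output width (assumed)
    start_idx = self.tp_rank * shard_size     # this rank's offset (assumed)
    return bias[start_idx:start_idx + shard_size]
```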
```python
        self, bias: List[Union[torch.Tensor,
                               None]]) -> List[Union[torch.Tensor, None]]:
```
Might be better to use a tuple here instead of a list. It's clearer when there's a fixed number of elements, and performance is generally better.
Same as above, I have let it mimic slice_lora_b.
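For illustration, the tuple-typed variant the reviewer suggests would look roughly like this for a merged two-slice layer (a sketch only; output_slices and tp_rank are assumed attributes, and the PR keeps the list form to stay symmetric with slice_lora_b):

```python
from typing import Optional, Tuple

import torch


def slice_bias(
    self, bias: Tuple[Optional[torch.Tensor], Optional[torch.Tensor]]
) -> Tuple[Optional[torch.Tensor], Optional[torch.Tensor]]:
    """Slice each sub-bias of a merged projection for this TP rank."""
    sliced = []
    for i, b in enumerate(bias):
        if b is None:
            sliced.append(None)
            continue
        shard_size = self.output_slices[i]    # per-slice shard width (assumed)
        start_idx = self.tp_rank * shard_size
        sliced.append(b[start_idx:start_idx + shard_size])
    return (sliced[0], sliced[1])
```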
@njhill I have addressed your comments above. Can you please review this again? Thanks
Thanks @followumesh and sorry for the delay.
There's one remaining small but important thing to fix (and tests are failing due to this).
vllm/lora/models.py
```python
if not self.lora_config.bias_enabled:
    module_lora.bias = None
    raise ValueError(
        f"Adapter bias cannot be used for {module_name}"
        " without --enable-lora-bias.")
```
This doesn't look right and is causing blanket lora failures. I think it should be:
```diff
-if not self.lora_config.bias_enabled:
-    module_lora.bias = None
-    raise ValueError(
-        f"Adapter bias cannot be used for {module_name}"
-        " without --enable-lora-bias.")
+if module_lora.bias is not None and not self.lora_config.bias_enabled:
+    raise ValueError(
+        f"Adapter bias cannot be used for {module_name}"
+        " without --enable-lora-bias.")
```
Incorporated the comment.
Signed-off-by: Umesh Deshpande <[email protected]>
Motivation
PEFT, as used in https://github.com/foundation-model-stack/fms-hf-tuning, includes support for tuning LoRA bias. This PR enables bias for LoRA so that adapters with bias work with vLLM.
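For context, an adapter carrying bias terms can be produced with PEFT roughly like this (a minimal sketch; the base model, target modules, and save path are placeholders, and bias="lora_only" is one of the PEFT options that makes bias parameters trainable so they are saved with the adapter):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # placeholder
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    bias="lora_only",  # bias params become trainable and are saved with the adapter
)
model = get_peft_model(base, config)
# ... fine-tune ...
model.save_pretrained("/path/to/lora-with-bias")  # placeholder path
```

Serving an adapter like this with vLLM then requires starting the engine with --enable-lora-bias, the flag added by this PR.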
Changes Included